TACKLING IMBALANCED CLASS IN SOFTWARE DEFECT PREDICTION USING TWO-STEP CLUSTER BASED RANDOM UNDERSAMPLING AND STACKING TECHNIQUE

Similar resources

Software Defect Prediction for High-Dimensional and Class-Imbalanced Data

Software quality and reliability can be improved using various techniques during the software development process. One effective method is to utilize software metrics and defect data collected during the software development life cycle and build defect predictors using data mining techniques to estimate the quality of target program modules. Such a strategy allows practitioners to intelligently...

Learning from Imbalanced Data Using Ensemble Methods and Cluster-Based Undersampling

Imbalanced data, where the number of instances of one class is much higher than the others, are frequent in many domains such as fraud detection, telecommunications management, oil spill detection and text classification. Traditional classifiers do not perform well when considering data that are susceptible to both within-class and between-class imbalances. In this paper, we propose the ClustFi...

A Novel Approach for Handling Imbalanced Data in Medical Diagnosis using Undersampling Technique

The imbalanced learning problem is becoming ubiquitous in data mining applications. Data sets with an unequal distribution of samples among classes are known as imbalanced data sets. When such highly imbalanced data sets are given to a classifier, the classifier may misclassify the rare samples from the minority class. To deal with such type of im...

Cluster-Based Image Segmentation Using Fuzzy Markov Random Field

Image segmentation is an important task in image processing and computer vision that attracts the attention of many researchers. An image carries two kinds of pixel information, statistical and structural, which refer to the feature values of the pixel data and the local correlation of the pixel data, respectively. Markov random field (MRF) is a tool for modeling statistical and structural inf...

Stacking Class Probabilities Obtained from View-Based Cluster Ensembles

In pattern recognition applications with a high number of input features and an insufficient number of samples, the curse of dimensionality can be overcome by extracting features from smaller feature subsets. Domain knowledge, for example, can be used to group some of the features together into subsets known as “views”. The features extracted from views can later be combined (i.e. stacking) t...

Journal

Journal title: Jurnal Teknologi

Year: 2017

ISSN: 2180-3722,0127-9696

DOI: 10.11113/jt.v79.11874